
    Blind men and elephants: What do citation summaries tell us about a research article?

    The old Asian legend about the blind men and the elephant comes to mind when looking at how different authors of scientific papers describe a piece of related prior work. It turns out that different citations to the same paper often focus on different aspects of that paper, and that none of them provides a full description of its contributions. In this article, we describe our investigation of this phenomenon. We studied citation summaries in the context of research papers in the biomedical domain. A citation summary is the set of citing sentences for a given article and can be used as a surrogate for the actual article in a variety of scenarios. It contains information that was deemed by peers to be important. Our study shows that citation summaries overlap to some extent with the abstracts of the papers, but also differ from them in that they focus on different aspects of these papers than the abstracts do. In addition, co-cited articles (pairs of articles cited by the same article) tend to be similar. We show results based on a lexical similarity metric called cohesion to justify our claims.
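The overlap between a citation summary and an abstract can be quantified with a lexical similarity score. The sketch below uses plain cosine similarity over bag-of-words vectors as a generic stand-in; it is not the paper's exact cohesion metric, and the example sentences are invented for illustration.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# A citation summary is simply the concatenation of the citing sentences.
citing_sentences = [
    "Their method extracts protein names from biomedical text.",
    "They report high precision on protein name extraction.",
]
abstract = "We present a system for extracting protein names from articles."
summary = " ".join(citing_sentences)
overlap = cosine_similarity(summary, abstract)  # partial lexical overlap
```

In practice the vectors would be computed over stemmed, stopword-filtered terms rather than raw tokens.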

    Continuous low-dose antibiotic prophylaxis for adults with repeated urinary tract infections (AnTIC): a randomised, open-label trial

    Funder: UK National Institute for Health Research. Open Access funded by the UK Department of Health. Acknowledgments: We thank all the participants for their commitment to the study, Sheila Wallace for updating the systematic review, and the members of the Trial Steering Committee and the Data Monitoring Committee for their valuable guidance. We thank the National Health Service organisations, principal investigators, and local research staff who hosted and ran the study at each site. We thank the Health Technology Assessment Programme of the UK NIHR for funding the study (no. 11/72/01). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the UK Government Department of Health. A full report of the study [30] has been published by the NIHR Library.

    Unsupervised Paraphrasing via Deep Reinforcement Learning

    Paraphrasing is expressing the meaning of an input sentence in different wording while maintaining fluency (i.e., grammatical and syntactic correctness). Most existing work on paraphrasing uses supervised models that are limited to specific domains (e.g., image captions). Such models can neither be straightforwardly transferred to other domains nor generalize well, and creating labeled training data for new domains is expensive and laborious. The need for paraphrasing across different domains and the scarcity of labeled training data in many such domains call for exploring unsupervised paraphrase generation methods. We propose Progressive Unsupervised Paraphrasing (PUP): a novel unsupervised paraphrase generation method based on deep reinforcement learning (DRL). PUP uses a variational autoencoder (trained on a non-parallel corpus) to generate a seed paraphrase that warm-starts the DRL model. Then, PUP progressively tunes the seed paraphrase guided by our novel reward function, which combines semantic adequacy, language fluency, and expression diversity measures to quantify the quality of the generated paraphrases in each iteration without needing parallel sentences. Our extensive experimental evaluation shows that PUP outperforms unsupervised state-of-the-art paraphrasing techniques in terms of both automatic metrics and user studies on four real datasets. We also show that PUP outperforms domain-adapted supervised algorithms on several datasets. Our evaluation also shows that PUP achieves a great trade-off between semantic similarity and diversity of expression.
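The reward described above is a combination of three quality signals. The sketch below shows the general shape of such a reward; the component scorers, weights, and examples are illustrative placeholders, not the paper's actual formulation.

```python
def paraphrase_reward(source, candidate,
                      semantic_fn, fluency_fn, diversity_fn,
                      weights=(0.4, 0.3, 0.3)):
    """Weighted combination of semantic adequacy, fluency, and diversity.
    The weights and scorers here are illustrative, not PUP's exact ones."""
    scores = (semantic_fn(source, candidate),
              fluency_fn(candidate),
              diversity_fn(source, candidate))
    return sum(w * s for w, s in zip(weights, scores))

# Toy scorers for demonstration only: word overlap stands in for semantic
# adequacy, a constant for a language-model fluency score, and 1 - overlap
# for expression diversity.
semantic  = lambda s, c: len(set(s.split()) & set(c.split())) / len(set(s.split()))
fluency   = lambda c: 1.0
diversity = lambda s, c: 1.0 - len(set(s.split()) & set(c.split())) / len(set(c.split()))

r = paraphrase_reward("the cat sat on the mat", "a cat rested on a mat",
                      semantic, fluency, diversity)
```

In DRL training this scalar reward would drive policy-gradient updates of the paraphrase generator at each iteration.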

    Symmetric Weighted First-Order Model Counting

    The FO Model Counting problem (FOMC) is the following: given a sentence Φ in FO and a number n, compute the number of models of Φ over a domain of size n; the Weighted variant (WFOMC) generalizes the problem by associating a weight to each tuple and defining the weight of a model to be the product of the weights of its tuples. In this paper we study the complexity of the symmetric WFOMC, where all tuples of a given relation have the same weight. Our motivation comes from an important application, inference in Knowledge Bases with soft constraints, like Markov Logic Networks, but the problem is also of independent theoretical interest. We study both the data complexity and the combined complexity of FOMC and WFOMC. For the data complexity we prove the existence of an FO³ formula for which FOMC is #P₁-complete, and the existence of a Conjunctive Query for which WFOMC is #P₁-complete. We also prove that all γ-acyclic queries have polynomial-time data complexity. For the combined complexity, we prove that, for every fragment FOᵏ, k ≥ 2, the combined complexity of FOMC (or WFOMC) is #P-complete. Comment: To appear at PODS'1
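The WFOMC definition above can be made concrete with a brute-force sanity check over a single binary relation R: enumerate every interpretation of R over a domain of size n, keep those satisfying the sentence, and weight each model by the product of its tuple weights. The sentence used here (symmetry of R) is our own example, not one from the paper.

```python
from itertools import product

def wfomc(n, holds, weight=1.0):
    """Brute-force symmetric WFOMC for a sentence over one binary relation R.
    Enumerates all 2^(n*n) interpretations; each model contributes
    weight^|R| (the product of per-tuple weights, all equal in the
    symmetric setting). Exponential -- only a sanity check for tiny n."""
    pairs = list(product(range(n), repeat=2))
    total = 0.0
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        if holds(R):
            total += weight ** len(R)
    return total

# Phi = forall x forall y: R(x,y) -> R(y,x), i.e. R is symmetric.
symmetric = lambda R: all((y, x) in R for (x, y) in R)

# With unit weights this is plain FOMC: the number of symmetric binary
# relations on n elements is 2^(n(n+1)/2); for n = 3 that is 2^6 = 64.
count = wfomc(3, symmetric)
```

The point of the paper is precisely that, for certain fragments, this count can be computed in time polynomial in n rather than by such exponential enumeration.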

    GIANT: Scalable Creation of a Web-scale Ontology

    Understanding what online users may pay attention to is key to content recommendation and search services. These services will benefit from a highly structured, web-scale ontology of entities, concepts, events, topics, and categories. While existing knowledge bases and taxonomies embody a large volume of entities and categories, we argue that they fail to discover properly grained concepts, events, and topics in the language style of the online population, nor do they maintain a logically structured ontology among these notions. In this paper, we present GIANT, a mechanism to construct a user-centered, web-scale, structured ontology, containing a large number of natural language phrases conforming to user attentions at various granularities, mined from a vast volume of web documents and search click graphs. Various types of edges are also constructed to maintain a hierarchy in the ontology. We present our graph-neural-network-based techniques used in GIANT, and evaluate the proposed methods against a variety of baselines. GIANT has produced the Attention Ontology, which has been deployed in various Tencent applications involving over a billion users. Online A/B testing performed on Tencent QQ Browser shows that the Attention Ontology can significantly improve click-through rates in news recommendation. Comment: Accepted as full paper by SIGMOD 202
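An ontology of this shape is a typed graph: nodes at several granularities (categories, concepts, topics, entities) connected by labeled edges that encode the hierarchy. The node names, types, and edge labels below are invented for illustration, not taken from the Attention Ontology.

```python
# Illustrative typed graph: node -> granularity level, plus labeled edges
# (tail, relation, head). All names here are made up for the example.
ontology = {
    "nodes": {
        "autos": "category",
        "electric vehicles": "concept",
        "Tesla Model 3 review": "topic",
        "Tesla Model 3": "entity",
    },
    "edges": [
        ("electric vehicles", "isA", "autos"),
        ("Tesla Model 3 review", "isA", "electric vehicles"),
        ("Tesla Model 3 review", "involves", "Tesla Model 3"),
    ],
}

def parents(node, rel="isA"):
    """Walk one step up the hierarchy along a given edge label."""
    return [h for (t, r, h) in ontology["edges"] if t == node and r == rel]
```

Hierarchical edges like `isA` are what let a recommender generalize from a fine-grained topic a user clicked to the coarser concepts and categories above it.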

    Open Question Answering

    Thesis (Ph.D.), University of Washington, 2014. For the past fifteen years, search engines like Google have been the dominant way of finding information online. However, search engines break down when presented with complex information needs expressed as natural language questions. Further, as more people access the web from mobile devices with limited input/output capabilities, the need for software that can interpret and answer questions becomes more pressing. This dissertation studies the design of Open Question Answering (Open QA) systems that answer questions by reasoning over large, open-domain knowledge bases. Open QA systems face two challenges. The first is knowledge acquisition: how does the system acquire and represent the knowledge needed to answer questions? I describe a simple and scalable information extraction technique that automatically constructs an open-domain knowledge base from web text. The second is question interpretation: how does the system robustly map questions to queries over its knowledge? I describe algorithms that learn to interpret questions by leveraging massive amounts of data from community QA sites like WikiAnswers. This dissertation shows that combining information extraction with community-QA data can enable Open QA at a much larger scale than was previously possible.
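The two challenges above, a triple store built by information extraction and a mapping from questions to queries over it, can be sketched in miniature. The triples and hand-written patterns below are invented for illustration; the actual system learns the question-to-query mapping from community-QA paraphrase data rather than using fixed patterns.

```python
import re

# A miniature open-domain KB of (subject, relation, object) triples,
# of the kind open information extraction produces from web text.
# All triples here are made-up examples.
kb = [
    ("aspirin", "treats", "headache"),
    ("seattle", "is located in", "washington"),
]

# Hand-written question patterns mapping to KB relations; a learned
# system would induce these mappings from question paraphrases instead.
patterns = [
    (re.compile(r"what does (\w+) treat\??"), "treats"),
    (re.compile(r"where is (\w+)\??"), "is located in"),
]

def answer(question):
    """Map a natural-language question to a KB lookup via the patterns."""
    q = question.lower()
    for pat, rel in patterns:
        m = pat.match(q)
        if m:
            subj = m.group(1)
            return [o for (s, r, o) in kb if s == subj and r == rel]
    return []
```

The gap between this sketch and a real Open QA system is scale: millions of extracted triples and millions of paraphrased question forms, with the mapping learned rather than enumerated.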